Today’s Objectives

Cleft Lip and Palate 1/3

Cleft lip and cleft palate (CLP) are splits in the upper lip, the roof of the mouth (palate) or both. They result when facial structures that are developing in an unborn baby do not close completely. CLP is one of the most common birth defects with a frequency of 1/700 live births.

Cleft lip and palate

Cleft Lip and Palate 2/3

Children with cleft lip with or without cleft palate face a variety of challenges, depending on the type and severity of the cleft.

Reference: Mayo Foundation for Medical Education and Research

Cleft Lip and Palate 3/3

Reference: Hum Mol Genet. 2014 May 15; 23(10): 2711–2720

Question

Given:

  1. Pathogenic variants in IRF6 are found in only 70% of Van der Woude syndrome (VWS) families

  2. IRF6 is a transcription factor

How can we identify other genes that might be involved in the remaining 30% of the VWS families?

Hint

Hypothesis

Why Microarray?

Original Paper

PMID: 17041601

Experimental Design

Dataset

ID KO1 KO2 KO3 WT1 WT2 WT3
1415670_at 6531.0 5562.8 6822.4 7732.1 7191.2 7551.9
1415671_at 11486.3 10542.7 10641.4 10408.2 9484.5 7650.2
1415672_at 14339.2 13526.1 14444.7 12936.6 13841.7 13285.7
1415673_at 3156.8 2219.5 3264.4 2374.2 2201.8 2525.3

Loading

We are going to load the dataset from a TSV file (Irf6.tsv) into a variable called data using the read.table function.

data here is just an arbitrary variable name that holds the result of read.table; it could be named almost anything. See The State of Naming Conventions in R (Bååth 2012) for more information on naming variables in R.

# Load data from the text file into a variable
data = as.matrix(read.table("Irf6.tsv", header = TRUE, row.names = 1))

Note: the hash sign (#) indicates that what comes after is a comment. Comments are for documentation and readability of the R code and they are not evaluated (or executed).
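
As a minimal sketch of what the loading step does, the snippet below reads a tiny tab-separated table from an inline string instead of Irf6.tsv. The two probe rows are copied from the dataset preview (with two columns dropped for brevity); the names tsv_text and toy are made up for illustration.

```r
# Toy stand-in for Irf6.tsv: a header row plus two probe rows (tab-separated)
tsv_text = "ID\tKO1\tKO2\tWT1\tWT2
1415670_at\t6531.0\t5562.8\t7732.1\t7191.2
1415671_at\t11486.3\t10542.7\t10408.2\t9484.5"

# header = TRUE  -> the first line holds the column names
# row.names = 1  -> the first column (probe IDs) becomes the row names
toy = as.matrix(read.table(text = tsv_text, header = TRUE, row.names = 1))

dim(toy)        # rows (probes) x columns (samples)
rownames(toy)   # probe IDs
is.numeric(toy) # the matrix holds numbers only, as hist() and log2() expect
```

Because the probe IDs move into the row names, every remaining cell is numeric, which is what allows as.matrix to return a numeric matrix.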

Checking

dim(data) # Dimension of the dataset
## [1] 45101     6
head(data) # First few rows
KO1 KO2 KO3 WT1 WT2 WT3
1415670_at 6531.0 5562.8 6822.4 7732.1 7191.2 7551.9
1415671_at 11486.3 10542.7 10641.4 10408.2 9484.5 7650.2
1415672_at 14339.2 13526.1 14444.7 12936.6 13841.7 13285.7
1415673_at 3156.8 2219.5 3264.4 2374.2 2201.8 2525.3
1415674_a_at 4002.0 3306.9 3777.0 3760.6 3137.0 2911.5
1415675_at 3468.4 3347.4 3332.9 3073.5 3046.0 2914.4

Exploring

Check the behavior of the data (e.g., normal?, skewed?)

hist(data, col = "gray", main="Histogram")

Transforming

\(\log_2\) transformation (why?)

data2 = log2(data)
hist(data2, col = "gray", main="Histogram")
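
One common answer to the "why?" is that raw intensities are right-skewed and that doubling and halving look unequal on the raw scale. The sketch below illustrates both points with made-up and simulated numbers; none of the values come from the Irf6 dataset.

```r
# Hypothetical intensities: a baseline, its double, and its half
baseline = 1000
doubled  = 2000
halved   = 500

# On the raw scale the two changes look unequal (+1000 vs -500)
doubled - baseline
halved - baseline

# On the log2 scale they are symmetric around zero (+1 vs -1)
log2(doubled) - log2(baseline)
log2(halved) - log2(baseline)

# Right-skewed data become more symmetric after the transformation
set.seed(1)
raw = 2^rnorm(1000, mean = 10, sd = 2) # simulated log-normal intensities
c(mean = mean(raw), median = median(raw))             # mean far above median
c(mean = mean(log2(raw)), median = median(log2(raw))) # mean close to median
```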

Multiple Plots 1/2

samples = colnames(data2) # Headers (names) of the columns
samples
## [1] "KO1" "KO2" "KO3" "WT1" "WT2" "WT3"
par( mfrow = c( 2, 3 ) ) # Split the screen into 2 rows x 3 columns of partitions

for (i in 1:3) {
  # for each of the first 3 columns in the table
  hist(data2[,i], col = "red", main = samples[i])
}

for (i in 4:6) {
  # for each of the last 3 columns in the table
  hist(data2[,i], col = "green", main = samples[i])
}

Multiple Plots 2/2

par( mfrow = c( 1, 1 ) ) # Just to set screen back to 1 partition

Boxplot

colors = c(rep("red", 3), rep("green", 3))
boxplot(data2, col = colors, las = 2)

Clustering 1/2

Hierarchical clustering of the samples (i.e., columns) based on the correlation coefficients of the expression values

hc = hclust(as.dist(1 - cor(data2)))
plot(hc)
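
To see how the correlation-based distance behaves, here is a small sketch on simulated data. The matrix sim, the effect sizes, and the noise levels are all assumptions chosen for illustration; only the sample names mirror the dataset.

```r
set.seed(42)
# Simulated log2 expression: 200 genes x 6 samples
genes = 200
base  = rnorm(genes, mean = 10, sd = 2)   # shared per-gene baseline
shift = rnorm(genes, mean = 0, sd = 1.5)  # hypothetical knockout effect per gene
sim = cbind(
  KO1 = base + shift + rnorm(genes, sd = 0.2),
  KO2 = base + shift + rnorm(genes, sd = 0.2),
  KO3 = base + shift + rnorm(genes, sd = 0.2),
  WT1 = base + rnorm(genes, sd = 0.2),
  WT2 = base + rnorm(genes, sd = 0.2),
  WT3 = base + rnorm(genes, sd = 0.2)
)

# cor() correlates columns (samples); 1 - r turns similarity into a distance:
# r = 1 gives distance 0, r = 0 gives 1, r = -1 gives 2
d = as.dist(1 - cor(sim))
hc_sim = hclust(d)

# Cut the tree into two groups and see which samples fall together
groups = cutree(hc_sim, k = 2)
groups
```

If the replicates of a condition behave alike, they should end up in the same group, which is the pattern to look for in the dendrogram of the real data.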

Clustering 2/2

To learn about a function (e.g., hclust), you may type ?function (e.g., ?hclust) in the console to launch R documentation on that function:

Comparing

We are going to compare the means of the replicates of the two conditions

# Compute the means of the samples of each condition
ko = apply(data2[, 1:3], 1, mean)
head(ko)
##   1415670_at   1415671_at   1415672_at   1415673_at 1415674_a_at 
##     12.61692     13.40966     13.78313     11.47096     11.84693 
##   1415675_at 
##     11.72381
wt = apply(data2[, 4:6], 1, mean)
head(wt)
##   1415670_at   1415671_at   1415672_at   1415673_at 1415674_a_at 
##     12.87043     13.15269     13.70450     11.20664     11.66649 
##   1415675_at 
##     11.55578
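
The per-row means computed with apply can also be obtained with rowMeans. The sketch below checks that the two agree on a small made-up matrix; the gene names and values are hypothetical.

```r
# Small made-up log2 matrix: 3 genes x 3 replicates
m = matrix(c(12.7, 12.4, 12.7,
             13.5, 13.4, 13.4,
             11.6, 11.1, 11.7),
           nrow = 3, byrow = TRUE,
           dimnames = list(c("gene_a", "gene_b", "gene_c"),
                           c("KO1", "KO2", "KO3")))

by_apply    = apply(m, 1, mean) # MARGIN = 1 means "apply over rows"
by_rowMeans = rowMeans(m)       # same result, computed in one vectorized call

all.equal(by_apply, by_rowMeans)
```

On a matrix with tens of thousands of rows, rowMeans is usually noticeably faster, although either form gives the same numbers.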

Scatter 1/2

Scatter 2/2

pairs(data2) # All pairwise comparisons

Differentially Expressed Genes (DEGs)

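To identify DEGs, we will identify:

  • Genes with biological significance (fold-change beyond a cutoff)
  • Genes with statistical significance (p-value below a cutoff)

Then, we will take the overlap (intersection) of the two sets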

Biological Significance (fold-change) 1/2

fold = ko - wt # Difference between means
head(fold)
##   1415670_at   1415671_at   1415672_at   1415673_at 1415674_a_at 
##  -0.25351267   0.25697097   0.07863227   0.26431191   0.18044345 
##   1415675_at 
##   0.16803065
  • +ve \(\rightarrow\) Up-regulation \(\uparrow\)
  • -ve \(\rightarrow\) Down-regulation \(\downarrow\)
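
Because the means are taken on log2 values, their difference is a log2 fold-change rather than a plain ratio. The sketch below recomputes it for the first probe in the dataset preview and converts it back to the linear scale; the variable names are made up for illustration.

```r
# Raw intensities of probe 1415670_at, copied from the dataset preview
ko_raw = c(6531.0, 5562.8, 6822.4)
wt_raw = c(7732.1, 7191.2, 7551.9)

# Difference of mean log2 values = log2 fold-change (KO relative to WT)
log2_fc = mean(log2(ko_raw)) - mean(log2(wt_raw))
log2_fc

# Back on the linear scale: a ratio below 1 means lower expression in KO
ratio = 2^log2_fc
ratio

# A log2 difference of -2 or +2 corresponds to a 4-fold change either way
2^c(-2, 2)
```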

Biological Significance (fold-change) 2/2

hist(fold, col = "gray") # Histogram of the fold

Statistical Significance (p-value) 1/3

t-test

Suppose x and y are samples drawn from two populations, X and Y, respectively. A t-test determines whether the means of the two populations are significantly different:

x = c(1, 3, 5, 7, 9)
y = c(2, 4, 6, 8, 10)
t.test(x, y)
## 
##  Welch Two Sample t-test
## 
## data:  x and y
## t = -0.5, df = 8, p-value = 0.6305
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -5.612008  3.612008
## sample estimates:
## mean of x mean of y 
##         5         6
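
To make the numbers in that output less opaque, the sketch below recomputes the Welch statistic, degrees of freedom, and p-value by hand for the same x and y. It is an illustration of the standard formulas, not a replacement for t.test.

```r
x = c(1, 3, 5, 7, 9)
y = c(2, 4, 6, 8, 10)

# Squared standard error contributed by each sample (variances are not pooled)
vx = var(x) / length(x)
vy = var(y) / length(y)

# t statistic: difference in means measured in standard errors
t_stat = (mean(x) - mean(y)) / sqrt(vx + vy)

# Welch-Satterthwaite approximation of the degrees of freedom
df = (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# Two-sided p-value: probability of a |t| at least this large if the means are equal
p = 2 * pt(-abs(t_stat), df)

c(t = t_stat, df = df, p = p)
```

The three values should match the t, df, and p-value lines printed by t.test(x, y).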

Case A

broom::tidy(t.test(x, y))
estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high method alternative
0.3419216 -0.3426252 -0.6845468 0.7582038 0.4583316 17.65468 -0.6068459 1.290689 Welch Two Sample t-test two.sided

Case B

broom::tidy(t.test(x, y))
estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high method alternative
4.124747 1.768308 -2.356439 8.052339 3e-07 17.5567 3.046615 5.202879 Welch Two Sample t-test two.sided

Statistical Significance (p-value) 2/3

pvalue = NULL # Empty vector that will collect the p-values
n = nrow(data) # Number of genes (rows)
n
## [1] 45101
for(i in 1 : n) {
  # for each gene
  x = data2[i, 1:3] # KO values of gene number i
  y = data2[i, 4:6] # WT values of gene number i
  t = t.test(x, y) # t-test between the two conditions
  pvalue[i] = t$p.value # Put current p-value in the list of p-values
}
head(pvalue)
## [1] 0.092706280 0.182663337 0.129779075 0.272899180 0.262377176 0.005947807
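
The same per-gene test can be written without an explicit loop. The sketch below compares a loop with an apply call on a simulated matrix; sim is a made-up stand-in for data2, with the first three columns playing the role of KO and the last three of WT.

```r
set.seed(7)
# Simulated stand-in for data2: 50 genes x 6 samples (3 KO, then 3 WT)
sim = matrix(rnorm(50 * 6, mean = 10), nrow = 50,
             dimnames = list(NULL, c("KO1", "KO2", "KO3", "WT1", "WT2", "WT3")))

# Loop version, with the result vector allocated up front
p_loop = numeric(nrow(sim))
for (i in seq_len(nrow(sim))) {
  p_loop[i] = t.test(sim[i, 1:3], sim[i, 4:6])$p.value
}

# Same computation with apply(): one t-test per row
p_apply = apply(sim, 1, function(row) t.test(row[1:3], row[4:6])$p.value)

all.equal(p_loop, p_apply)
```

Allocating the result with numeric(n) avoids growing the vector one element at a time, which matters more as the number of genes increases.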

Statistical Significance (p-value) 3/3

hist(-log10(pvalue), col = "gray") # Histogram of p-values (-log10)

Volcano : Statistical & Biological 1/3

plot(-log10(pvalue) ~ fold)

Volcano : Statistical & Biological 2/3

fold_cutoff = 2 # On the log2 scale, i.e., a 4-fold change
pvalue_cutoff = 0.01

plot(-log10(pvalue) ~ fold)

abline(v = fold_cutoff, col = "blue", lwd = 3)
abline(v = -fold_cutoff, col = "red", lwd = 3)
abline(h = -log10(pvalue_cutoff), col = "green", lwd = 3)
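
Both axes of the volcano plot are on a log scale, so it can help to translate the cutoffs back into everyday units. The sketch below does that arithmetic for the two cutoffs defined above.

```r
fold_cutoff = 2
pvalue_cutoff = 0.01

# Vertical lines: fold holds log2 differences, so +/-2 means a 4-fold change
2^fold_cutoff

# Horizontal line: the height at which p = 0.01 sits on the -log10 axis
-log10(pvalue_cutoff)

# Smaller (more significant) p-values plot higher
-log10(c(0.05, 0.01, 0.001))
```

Points of interest are therefore those above the green line and outside the blue and red lines.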

Volcano : Statistical & Biological 3/3

Filtering for DEGs 1/3

filter_by_fold = abs(fold) >= fold_cutoff # Biological
sum(filter_by_fold) # Number of genes that satisfy the condition
## [1] 1051
filter_by_pvalue = pvalue <= pvalue_cutoff # Statistical
sum(filter_by_pvalue)
## [1] 1564
filter_combined = filter_by_fold & filter_by_pvalue # Combined
sum(filter_combined)
## [1] 276
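
The filters are logical vectors, so combining them with & is the element-wise counterpart of taking a set intersection. The sketch below shows this on five made-up genes; the names and values are hypothetical.

```r
# Made-up values for five genes
fold_toy   = c(g1 = 2.5, g2 = -3.1, g3 = 0.4, g4 = 2.2, g5 = -0.1)
pvalue_toy = c(g1 = 0.001, g2 = 0.2, g3 = 0.004, g4 = 0.009, g5 = 0.5)

by_fold   = abs(fold_toy) >= 2   # TRUE where the change is large enough
by_pvalue = pvalue_toy <= 0.01   # TRUE where the change is significant
combined  = by_fold & by_pvalue  # TRUE only where both hold

sum(combined)          # TRUE counts as 1, so sum() counts the genes kept
names(which(combined)) # which genes those are

# The same answer expressed as a set intersection of gene names
intersect(names(which(by_fold)), names(which(by_pvalue)))
```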

Filtering for DEGs 2/3

filtered = data2[filter_combined, ]
dim(filtered)
## [1] 276   6
head(filtered)
KO1 KO2 KO3 WT1 WT2 WT3
1416200_at 13.312004 12.973357 12.868456 7.40429 8.558803 8.683696
1416236_a_at 14.148397 14.039236 14.130007 12.23604 12.022403 11.495055
1417808_at 5.321928 5.442944 4.053111 15.16978 15.070087 14.753274
1417932_at 10.602884 10.257152 10.496055 13.98445 14.203295 13.720960
1418050_at 10.622052 10.975490 10.795066 12.86513 13.012048 12.658122
1418100_at 9.117903 8.634811 9.057721 12.90358 12.842449 12.233769

Filtering for DEGs 3/3

plot(-log10(pvalue) ~ fold)
points(-log10(pvalue[filter_combined]) ~ fold[filter_combined],
       col = "green")

Exercise

On the volcano plot, highlight the up-regulated genes in red and the down-regulated genes in blue

Solution 1/2

# Screen for the up-regulated genes (+ve fold)
filter_up = filter_combined & fold > 0

head(filter_up)
##   1415670_at   1415671_at   1415672_at   1415673_at 1415674_a_at 
##        FALSE        FALSE        FALSE        FALSE        FALSE 
##   1415675_at 
##        FALSE
# Number of filtered genes
sum(filter_up)
## [1] 95
# Screen for the down-regulated genes (-ve fold)
filter_down = filter_combined & fold < 0

head(filter_down)
##   1415670_at   1415671_at   1415672_at   1415673_at 1415674_a_at 
##        FALSE        FALSE        FALSE        FALSE        FALSE 
##   1415675_at 
##        FALSE
# Number of filtered genes
sum(filter_down)
## [1] 181

Solution 2/2

plot(-log10(pvalue) ~ fold)
points(-log10(pvalue[filter_up]) ~ fold[filter_up], col = "red")
points(-log10(pvalue[filter_down]) ~ fold[filter_down], col = "blue")

Heatmap 1/5

heatmap(filtered)

Heatmap 2/5

Heatmap 3/5

# Clustering of the columns (samples)
col_dendrogram = as.dendrogram(hclust(as.dist(1-cor(filtered))))
plot(col_dendrogram)

Heatmap 4/5

# Clustering of the rows (genes)
row_dendrogram = as.dendrogram(hclust(as.dist(1-cor(t(filtered)))))
plot(row_dendrogram)

Heatmap 5/5

# Heatmap with the rows and columns clustered by correlation coefficients
heatmap(filtered, Rowv=row_dendrogram, Colv=col_dendrogram)

Enhanced Heatmap 1/4

There is an enhanced heatmap function heatmap.2 that comes with the gplots package. heatmap.2 provides more options via a richer set of parameters.

install.packages("gplots") # Install the package (if not already installed)
library(gplots) # Load the package
## 
## Attaching package: 'gplots'
## The following object is masked from 'package:stats':
## 
##     lowess

Enhanced Heatmap 2/4

heatmap.2(filtered, Rowv=row_dendrogram, Colv=col_dendrogram)

Enhanced Heatmap 3/4

heatmap.2(filtered, Rowv=row_dendrogram, Colv=col_dendrogram,
          col = bluered(256))

Enhanced Heatmap 4/4

heatmap.2(filtered, Rowv=row_dendrogram, Colv=col_dendrogram,
          col = bluered(256), scale = "row")
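
As far as we understand it, scale = "row" colours each gene by its z-score across the samples instead of by its absolute level, so genes with very different baselines become comparable. The sketch below reproduces that row-wise standardization by hand on a made-up matrix; it is an illustration of the idea, not heatmap.2's internal code.

```r
# Made-up log2 values: two genes with very different expression levels
m = matrix(c(13.3, 13.0, 12.9, 7.4, 8.6, 8.7,
              5.3,  5.4,  4.1, 15.2, 15.1, 14.8),
           nrow = 2, byrow = TRUE,
           dimnames = list(c("gene_a", "gene_b"),
                           c("KO1", "KO2", "KO3", "WT1", "WT2", "WT3")))

# scale() works on columns, so transpose, scale, and transpose back
z = t(scale(t(m)))

round(rowMeans(z), 10) # each row is centred on 0
apply(z, 1, sd)        # and has standard deviation 1
```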

Annotation 1/3

To obtain the functional annotation of the differentially expressed genes, we are going first to extract their probe ids:

filterd_ids = row.names(filtered) # ids of the filtered DE genes
length(filterd_ids)
## [1] 276
head(filterd_ids)
## [1] "1416200_at"   "1416236_a_at" "1417808_at"   "1417932_at"  
## [5] "1418050_at"   "1418100_at"

Then we can obtain the annotation online via the Database for Annotation, Visualization and Integrated Discovery (DAVID).

write.table(filterd_ids, file = "filterd_ids.txt", quote = FALSE,
            row.names = FALSE, col.names = FALSE) # One ID per line, no header
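
Web tools that accept a pasted or uploaded gene list generally expect one identifier per line, which we assume is the case here. The sketch below writes a few IDs that way with writeLines and reads them back to confirm the file contents; ids_toy and out_file are made-up names, and the file goes to a temporary location.

```r
ids_toy = c("1416200_at", "1416236_a_at", "1417808_at") # first IDs from above

out_file = tempfile(fileext = ".txt") # temporary path, for illustration only
writeLines(ids_toy, out_file)         # one ID per line, no header, no quotes

readLines(out_file)                   # read back to confirm the file contents
```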

Annotation 2/3

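Alternatively, we can generate a comprehensive functional annotation via the Bioconductor packages annaffy and mouse4302.db.

To install the Bioconductor packages (if they are not already installed):

# biocLite() has been retired; current Bioconductor releases install via BiocManager
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install(c("annaffy", "mouse4302.db"))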

Load the packages and extract annotation of the filtered ids:

library(annaffy)
library(mouse4302.db) # Annotation data for the Mouse Genome 430 2.0 array
annotation_table = aafTableAnn(filterd_ids, "mouse4302.db")
saveHTML(annotation_table, file="filtered.html")
browseURL("filtered.html")
## An object of class "aafTable"
## Slot "probeids":
## [1] "1416200_at"
## 
## Slot "table":
## $Probe
## An object of class "aafList"
## [[1]]
## An object of class "aafProbe"
## [1] "1416200_at"
## 
## 
## $Symbol
## An object of class "aafList"
## [[1]]
## [1] "Il33"
## attr(,"class")
## [1] "aafSymbol"
## 
## 
## $Description
## An object of class "aafList"
## [[1]]
## [1] "interleukin 33"
## attr(,"class")
## [1] "aafDescription"
## 
## 
## $Chromosome
## An object of class "aafList"
## [[1]]
## [1] "19"
## attr(,"class")
## [1] "aafChromosome"
## 
## 
## $`Chromosome Location`
## An object of class "aafList"
## [[1]]
## An object of class "aafChromLoc"
## [1] 29945789 29925113
## 
## 
## $GenBank
## An object of class "aafList"
## [[1]]
## [1] "NM_133775"
## attr(,"class")
## [1] "aafGenBank"
## 
## 
## $Gene
## An object of class "aafList"
## [[1]]
## An object of class "aafLocusLink"
## [1] 77125
## 
## 
## $UniGene
## An object of class "aafList"
## [[1]]
## [1] "Mm.182359"
## attr(,"class")
## [1] "aafUniGene"
## 
## 
## $PubMed
## An object of class "aafList"
## [[1]]
## An object of class "aafPubMed"
##   [1]  8889548 10349636 11042159 11076861 11217851 12466851 12477932
##   [8] 12819012 15475267 15489334 16141072 16141073 16286016 16602821
##  [15] 17185418 17492053 17623648 17675517 17881510 18003919 18023358
##  [22] 18250453 18268038 18450470 18552204 18566365 18603409 18667700
##  [29] 18799693 18802081 19248109 19451398 19465481 19508382 19553541
##  [36] 19559631 19661270 19666510 19684081 19750479 19841166 19892870
##  [43] 19933859 19950183 20035719 20042577 20385815 20412815 20427273
##  [50] 20501612 20534524 20634488 20689814 20693421 20937871 20939024
##  [57] 21190867 21220696 21239713 21239718 21267068 21268000 21281751
##  [64] 21308681 21349253 21357533 21422152 21454686 21469105 21494550
##  [71] 21515798 21557213 21642589 21646790 21677750 21703183 21703403
##  [78] 21734074 21797940 21835205 21887788 21945667 21949025 21949094
##  [85] 21972019 22013230 22013485 22119406 22142849 22180658 22198948
##  [92] 22258632 22267218 22270365 22294690 22307629 22323740 22329990
##  [99] 22331917 22370606 22371395 22460070 22542450 22585447 22634619
## [106] 22660580 22661085 22665806 22686327 22689946 22702477 22782692
## [113] 22802353 22922818 23006545 23071771 23093619 23132931 23148283
## [120] 23162017 23169007 23248269 23300625 23323935 23324173 23347081
## [127] 23363980 23397250 23403558 23418608 23496815 23499895 23523996
## [134] 23547117 23582173 23585480 23630360 23662055 23683462 23733876
## [141] 23810766 23837438 23892028 23894196 23911389 23918359 23945235
## [148] 23954132 23960191 24028396 24043894 24045639 24058536 24076135
## [155] 24076431 24105680 24194600 24205109 24220317 24257755 24356538
## [162] 24446518 24459820 24551140 24556514 24586149 24613091 24619410
## [169] 24675360 24730559 24786898 24860117 24860190 24892809 24982172
## [176] 24985397 25015831 25022964 25043027 25143444 25153903 25172162
## [183] 25278425 25313073 25429071 25458701 25472995 25500143 25504587
## [190] 25505285 25533952 25543045 25573803 25599561 25617223 25617473
## [197] 25660244 25661185 25676669 25683166 25693767 25714839 25714983
## [204] 25739051 25746970 25786179 25795135 25807992 25808546 25814531
## [211] 25829541 25847973 25857925 25870243 25930197 25941360 25944738
## [218] 25997709 26011644 26018806 26044350 26047701 26079807 26092469
## [225] 26100084 26200013 26230091 26243875 26244295 26249267 26251474
## [232] 26272855 26277897 26310268 26322482 26342029 26352378 26363151
## [239] 26365875 26386119 26428949 26473724 26489077 26490658 26514775
## [246] 26518437 26566691 26598236 26602156 26603638 26771472 26802241
## [253] 26811463 26872602 26872699 26908008 26943125 26987428 26991049
## [260] 27053610 27091974 27126934 27155324 27184849 27453471 27608599
## [267] 27626380
## 
## 
## $`Gene Ontology`
## An object of class "aafList"
## [[1]]
## An object of class "aafGO"
## [[1]][[1]]
## An object of class "aafGOItem"
## @id   "GO:0002281"
## @name "macrophage activation involved in immune response"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[2]]
## An object of class "aafGOItem"
## @id   "GO:0002282"
## @name "microglial cell activation involved in immune response"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[3]]
## An object of class "aafGOItem"
## @id   "GO:0002686"
## @name "negative regulation of leukocyte migration"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[4]]
## An object of class "aafGOItem"
## @id   "GO:0002826"
## @name "negative regulation of T-helper 1 type immune response"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[5]]
## An object of class "aafGOItem"
## @id   "GO:0002830"
## @name "positive regulation of type 2 immune response"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[6]]
## An object of class "aafGOItem"
## @id   "GO:0002830"
## @name "positive regulation of type 2 immune response"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[7]]
## An object of class "aafGOItem"
## @id   "GO:0005125"
## @name "cytokine activity"
## @type "Molecular Function"
## @evid "IDA"
## 
## [[1]][[8]]
## An object of class "aafGOItem"
## @id   "GO:0005125"
## @name "cytokine activity"
## @type "Molecular Function"
## @evid "IPI"
## 
## [[1]][[9]]
## An object of class "aafGOItem"
## @id   "GO:0005125"
## @name "cytokine activity"
## @type "Molecular Function"
## @evid "ISO"
## 
## [[1]][[10]]
## An object of class "aafGOItem"
## @id   "GO:0005576"
## @name "extracellular region"
## @type "Cellular Component"
## @evid "IEA"
## 
## [[1]][[11]]
## An object of class "aafGOItem"
## @id   "GO:0005615"
## @name "extracellular space"
## @type "Cellular Component"
## @evid "IEA"
## 
## [[1]][[12]]
## An object of class "aafGOItem"
## @id   "GO:0005634"
## @name "nucleus"
## @type "Cellular Component"
## @evid "ISO"
## 
## [[1]][[13]]
## An object of class "aafGOItem"
## @id   "GO:0005654"
## @name "nucleoplasm"
## @type "Cellular Component"
## @evid "ISO"
## 
## [[1]][[14]]
## An object of class "aafGOItem"
## @id   "GO:0005694"
## @name "chromosome"
## @type "Cellular Component"
## @evid "IEA"
## 
## [[1]][[15]]
## An object of class "aafGOItem"
## @id   "GO:0005829"
## @name "cytosol"
## @type "Cellular Component"
## @evid "ISO"
## 
## [[1]][[16]]
## An object of class "aafGOItem"
## @id   "GO:0006351"
## @name "transcription, DNA-templated"
## @type "Biological Process"
## @evid "IEA"
## 
## [[1]][[17]]
## An object of class "aafGOItem"
## @id   "GO:0010628"
## @name "positive regulation of gene expression"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[18]]
## An object of class "aafGOItem"
## @id   "GO:0031410"
## @name "cytoplasmic vesicle"
## @type "Cellular Component"
## @evid "IEA"
## 
## [[1]][[19]]
## An object of class "aafGOItem"
## @id   "GO:0032436"
## @name "positive regulation of proteasomal ubiquitin-dependent protein catabolic process"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[20]]
## An object of class "aafGOItem"
## @id   "GO:0032436"
## @name "positive regulation of proteasomal ubiquitin-dependent protein catabolic process"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[21]]
## An object of class "aafGOItem"
## @id   "GO:0032689"
## @name "negative regulation of interferon-gamma production"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[22]]
## An object of class "aafGOItem"
## @id   "GO:0032736"
## @name "positive regulation of interleukin-13 production"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[23]]
## An object of class "aafGOItem"
## @id   "GO:0032753"
## @name "positive regulation of interleukin-4 production"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[24]]
## An object of class "aafGOItem"
## @id   "GO:0032754"
## @name "positive regulation of interleukin-5 production"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[25]]
## An object of class "aafGOItem"
## @id   "GO:0032755"
## @name "positive regulation of interleukin-6 production"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[26]]
## An object of class "aafGOItem"
## @id   "GO:0043032"
## @name "positive regulation of macrophage activation"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[27]]
## An object of class "aafGOItem"
## @id   "GO:0043032"
## @name "positive regulation of macrophage activation"
## @type "Biological Process"
## @evid "ISO"
## 
## [[1]][[28]]
## An object of class "aafGOItem"
## @id   "GO:0045944"
## @name "positive regulation of transcription from RNA polymerase II promoter"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[29]]
## An object of class "aafGOItem"
## @id   "GO:0050729"
## @name "positive regulation of inflammatory response"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[30]]
## An object of class "aafGOItem"
## @id   "GO:0051024"
## @name "positive regulation of immunoglobulin secretion"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[31]]
## An object of class "aafGOItem"
## @id   "GO:0051025"
## @name "negative regulation of immunoglobulin secretion"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[32]]
## An object of class "aafGOItem"
## @id   "GO:0051607"
## @name "defense response to virus"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[33]]
## An object of class "aafGOItem"
## @id   "GO:0061518"
## @name "microglial cell proliferation"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[34]]
## An object of class "aafGOItem"
## @id   "GO:0090197"
## @name "positive regulation of chemokine secretion"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[35]]
## An object of class "aafGOItem"
## @id   "GO:0090197"
## @name "positive regulation of chemokine secretion"
## @type "Biological Process"
## @evid "ISO"
## 
## [[1]][[36]]
## An object of class "aafGOItem"
## @id   "GO:0097191"
## @name "extrinsic apoptotic signaling pathway"
## @type "Biological Process"
## @evid "IGI"
## 
## 
## 
## $Pathway
## An object of class "aafList"
## [[1]]
## An object of class "aafPathway"
## [[1]][[1]]
## An object of class "aafGOItem"
## @id     "04623"
## @name   "Cytosolic DNA-sensing pathway"
## @enzyme ""

Annotation 3/3

Annotation

Exercise

Generate separate annotations for the up-regulated and down-regulated differentially expressed genes between WT and KO. Discuss each set in as much molecular detail as possible.

Solution 1/2

Generating two lists of IDs for the up- and down-regulated genes

ids = row.names(data2) # All the IDs
length(ids)
## [1] 45101
up_ids = ids[filter_up] # IDs of up-regulated genes
length(up_ids)
## [1] 95
up_ids
##  [1] "1416200_at"   "1416236_a_at" "1418203_at"   "1419394_s_at"
##  [5] "1419427_at"   "1419591_at"   "1420006_at"   "1420064_s_at"
##  [9] "1420431_at"   "1420753_at"   "1421156_a_at" "1421346_a_at"
## [13] "1421456_at"   "1421551_s_at" "1421676_at"   "1422588_at"  
## [17] "1422672_at"   "1422783_a_at" "1422784_at"   "1422892_s_at"
## [21] "1422983_at"   "1423271_at"   "1425002_at"   "1425452_s_at"
## [25] "1426252_a_at" "1426600_at"   "1426911_at"   "1427700_x_at"
## [29] "1428088_at"   "1429408_at"   "1430800_at"   "1431093_at"  
## [33] "1431609_a_at" "1431654_at"   "1432742_at"   "1433005_at"  
## [37] "1434346_at"   "1435639_at"   "1437258_at"   "1437518_at"  
## [41] "1438219_at"   "1438850_at"   "1439016_x_at" "1439149_s_at"
## [45] "1439347_at"   "1439458_x_at" "1440147_at"   "1440589_at"  
## [49] "1441011_at"   "1441197_at"   "1441596_at"   "1441772_at"  
## [53] "1442936_at"   "1444254_at"   "1444650_at"   "1445310_at"  
## [57] "1445349_at"   "1445378_at"   "1446498_at"   "1446858_at"  
## [61] "1446976_at"   "1447722_at"   "1448265_x_at" "1448756_at"  
## [65] "1448932_at"   "1449417_at"   "1449555_a_at" "1449873_at"  
## [69] "1449994_at"   "1450004_at"   "1450019_at"   "1450571_a_at"
## [73] "1450618_a_at" "1450811_at"   "1451006_at"   "1451453_at"  
## [77] "1451594_s_at" "1451757_at"   "1452487_x_at" "1453238_s_at"
## [81] "1453326_at"   "1453916_at"   "1455872_at"   "1456877_at"  
## [85] "1457016_at"   "1457550_at"   "1457666_s_at" "1458085_at"  
## [89] "1458147_at"   "1459227_at"   "1459306_at"   "1459713_s_at"
## [93] "1459982_a_at" "1460372_at"   "1460514_s_at"
down_ids = ids[filter_down] # IDs of down-regulated genes
length(down_ids)
## [1] 181
down_ids
##   [1] "1417808_at"   "1417932_at"   "1418050_at"   "1418100_at"  
##   [5] "1418213_at"   "1418266_at"   "1418301_at"   "1418609_at"  
##   [9] "1418645_at"   "1418855_at"   "1419317_x_at" "1419409_at"  
##  [13] "1419431_at"   "1419532_at"   "1419588_at"   "1419731_at"  
##  [17] "1420141_at"   "1420183_at"   "1420350_at"   "1420550_at"  
##  [21] "1420677_x_at" "1420741_x_at" "1421091_at"   "1421092_at"  
##  [25] "1421473_at"   "1421500_at"   "1422196_at"   "1422222_at"  
##  [29] "1422583_at"   "1422820_at"   "1422871_at"   "1422876_at"  
##  [33] "1422904_at"   "1423320_at"   "1423476_at"   "1423542_at"  
##  [37] "1423634_at"   "1424090_at"   "1424263_at"   "1424409_at"  
##  [41] "1424683_at"   "1425321_a_at" "1426337_a_at" "1426507_at"  
##  [45] "1426511_at"   "1427082_at"   "1428355_at"   "1428662_a_at"
##  [49] "1428882_at"   "1428980_at"   "1429565_s_at" "1429676_at"  
##  [53] "1429702_at"   "1429802_at"   "1429869_at"   "1429964_at"  
##  [57] "1430087_at"   "1430256_at"   "1430306_a_at" "1430513_at"  
##  [61] "1430594_at"   "1430661_at"   "1430715_at"   "1430730_at"  
##  [65] "1430952_at"   "1431426_at"   "1431554_a_at" "1431754_at"  
##  [69] "1431821_a_at" "1431829_a_at" "1431936_a_at" "1432058_at"  
##  [73] "1432558_a_at" "1432787_at"   "1432976_at"   "1433226_at"  
##  [77] "1433801_at"   "1433858_at"   "1433923_at"   "1434083_a_at"
##  [81] "1434092_at"   "1434146_at"   "1434271_at"   "1434725_at"  
##  [85] "1434917_at"   "1435191_at"   "1435462_at"   "1435482_at"  
##  [89] "1435541_at"   "1435572_at"   "1436614_at"   "1436917_s_at"
##  [93] "1436976_a_at" "1437145_s_at" "1437269_at"   "1437662_at"  
##  [97] "1437672_at"   "1437866_at"   "1437893_at"   "1438665_at"  
## [101] "1438849_at"   "1439836_at"   "1439878_at"   "1440887_at"  
## [105] "1441292_at"   "1441610_at"   "1441672_at"   "1441745_at"  
## [109] "1441793_at"   "1441981_at"   "1442209_at"   "1442715_at"  
## [113] "1443163_at"   "1443292_at"   "1443974_at"   "1443997_at"  
## [117] "1444522_at"   "1445368_at"   "1445408_at"   "1445438_at"  
## [121] "1445983_at"   "1446110_at"   "1446120_at"   "1446246_at"  
## [125] "1446249_at"   "1446748_at"   "1446869_at"   "1447616_at"  
## [129] "1447669_s_at" "1448241_at"   "1448613_at"   "1448745_s_at"
## [133] "1448952_at"   "1449237_at"   "1449367_at"   "1449959_x_at"
## [137] "1450251_a_at" "1450470_at"   "1450505_a_at" "1450539_at"  
## [141] "1450748_at"   "1450936_a_at" "1451050_at"   "1451054_at"  
## [145] "1451258_at"   "1451287_s_at" "1451601_a_at" "1451613_at"  
## [149] "1452543_a_at" "1453092_at"   "1453218_at"   "1453257_at"  
## [153] "1453435_a_at" "1453488_at"   "1453568_at"   "1453620_at"  
## [157] "1453820_at"   "1454058_at"   "1454632_at"   "1454762_at"  
## [161] "1455324_at"   "1455825_s_at" "1455889_at"   "1456183_at"  
## [165] "1456211_at"   "1456481_at"   "1456512_at"   "1456784_at"  
## [169] "1457254_x_at" "1457433_x_at" "1457518_at"   "1457557_at"  
## [173] "1458367_at"   "1458519_at"   "1458533_at"   "1459037_at"  
## [177] "1459647_at"   "1459833_x_at" "1459898_at"   "1460624_at"  
## [181] "1460674_at"

Solution 2/2

Obtaining the annotation of the up- and down-regulated genes as tables

up_table = aafTableAnn(up_ids, "mouse4302.db")
head(up_table)
saveHTML(up_table, file="up_table.html")
browseURL("up_table.html")

down_table = aafTableAnn(down_ids, "mouse4302.db")
head(down_table)
saveHTML(down_table, file="down_table.html")
browseURL("down_table.html")
## An object of class "aafTable"
## Slot "probeids":
## [1] "1416200_at"
## 
## Slot "table":
## $Probe
## An object of class "aafList"
## [[1]]
## An object of class "aafProbe"
## [1] "1416200_at"
## 
## 
## $Symbol
## An object of class "aafList"
## [[1]]
## [1] "Il33"
## attr(,"class")
## [1] "aafSymbol"
## 
## 
## $Description
## An object of class "aafList"
## [[1]]
## [1] "interleukin 33"
## attr(,"class")
## [1] "aafDescription"
## 
## 
## $Chromosome
## An object of class "aafList"
## [[1]]
## [1] "19"
## attr(,"class")
## [1] "aafChromosome"
## 
## 
## $`Chromosome Location`
## An object of class "aafList"
## [[1]]
## An object of class "aafChromLoc"
## [1] 29945789 29925113
## 
## 
## $GenBank
## An object of class "aafList"
## [[1]]
## [1] "NM_133775"
## attr(,"class")
## [1] "aafGenBank"
## 
## 
## $Gene
## An object of class "aafList"
## [[1]]
## An object of class "aafLocusLink"
## [1] 77125
## 
## 
## $UniGene
## An object of class "aafList"
## [[1]]
## [1] "Mm.182359"
## attr(,"class")
## [1] "aafUniGene"
## 
## 
## $PubMed
## An object of class "aafList"
## [[1]]
## An object of class "aafPubMed"
##   [1]  8889548 10349636 11042159 11076861 11217851 12466851 12477932
##   [8] 12819012 15475267 15489334 16141072 16141073 16286016 16602821
##  [15] 17185418 17492053 17623648 17675517 17881510 18003919 18023358
##  [22] 18250453 18268038 18450470 18552204 18566365 18603409 18667700
##  [29] 18799693 18802081 19248109 19451398 19465481 19508382 19553541
##  [36] 19559631 19661270 19666510 19684081 19750479 19841166 19892870
##  [43] 19933859 19950183 20035719 20042577 20385815 20412815 20427273
##  [50] 20501612 20534524 20634488 20689814 20693421 20937871 20939024
##  [57] 21190867 21220696 21239713 21239718 21267068 21268000 21281751
##  [64] 21308681 21349253 21357533 21422152 21454686 21469105 21494550
##  [71] 21515798 21557213 21642589 21646790 21677750 21703183 21703403
##  [78] 21734074 21797940 21835205 21887788 21945667 21949025 21949094
##  [85] 21972019 22013230 22013485 22119406 22142849 22180658 22198948
##  [92] 22258632 22267218 22270365 22294690 22307629 22323740 22329990
##  [99] 22331917 22370606 22371395 22460070 22542450 22585447 22634619
## [106] 22660580 22661085 22665806 22686327 22689946 22702477 22782692
## [113] 22802353 22922818 23006545 23071771 23093619 23132931 23148283
## [120] 23162017 23169007 23248269 23300625 23323935 23324173 23347081
## [127] 23363980 23397250 23403558 23418608 23496815 23499895 23523996
## [134] 23547117 23582173 23585480 23630360 23662055 23683462 23733876
## [141] 23810766 23837438 23892028 23894196 23911389 23918359 23945235
## [148] 23954132 23960191 24028396 24043894 24045639 24058536 24076135
## [155] 24076431 24105680 24194600 24205109 24220317 24257755 24356538
## [162] 24446518 24459820 24551140 24556514 24586149 24613091 24619410
## [169] 24675360 24730559 24786898 24860117 24860190 24892809 24982172
## [176] 24985397 25015831 25022964 25043027 25143444 25153903 25172162
## [183] 25278425 25313073 25429071 25458701 25472995 25500143 25504587
## [190] 25505285 25533952 25543045 25573803 25599561 25617223 25617473
## [197] 25660244 25661185 25676669 25683166 25693767 25714839 25714983
## [204] 25739051 25746970 25786179 25795135 25807992 25808546 25814531
## [211] 25829541 25847973 25857925 25870243 25930197 25941360 25944738
## [218] 25997709 26011644 26018806 26044350 26047701 26079807 26092469
## [225] 26100084 26200013 26230091 26243875 26244295 26249267 26251474
## [232] 26272855 26277897 26310268 26322482 26342029 26352378 26363151
## [239] 26365875 26386119 26428949 26473724 26489077 26490658 26514775
## [246] 26518437 26566691 26598236 26602156 26603638 26771472 26802241
## [253] 26811463 26872602 26872699 26908008 26943125 26987428 26991049
## [260] 27053610 27091974 27126934 27155324 27184849 27453471 27608599
## [267] 27626380
## 
## 
## $`Gene Ontology`
## An object of class "aafList"
## [[1]]
## An object of class "aafGO"
## [[1]][[1]]
## An object of class "aafGOItem"
## @id   "GO:0002281"
## @name "macrophage activation involved in immune response"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[2]]
## An object of class "aafGOItem"
## @id   "GO:0002282"
## @name "microglial cell activation involved in immune response"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[3]]
## An object of class "aafGOItem"
## @id   "GO:0002686"
## @name "negative regulation of leukocyte migration"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[4]]
## An object of class "aafGOItem"
## @id   "GO:0002826"
## @name "negative regulation of T-helper 1 type immune response"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[5]]
## An object of class "aafGOItem"
## @id   "GO:0002830"
## @name "positive regulation of type 2 immune response"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[6]]
## An object of class "aafGOItem"
## @id   "GO:0002830"
## @name "positive regulation of type 2 immune response"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[7]]
## An object of class "aafGOItem"
## @id   "GO:0005125"
## @name "cytokine activity"
## @type "Molecular Function"
## @evid "IDA"
## 
## [[1]][[8]]
## An object of class "aafGOItem"
## @id   "GO:0005125"
## @name "cytokine activity"
## @type "Molecular Function"
## @evid "IPI"
## 
## [[1]][[9]]
## An object of class "aafGOItem"
## @id   "GO:0005125"
## @name "cytokine activity"
## @type "Molecular Function"
## @evid "ISO"
## 
## [[1]][[10]]
## An object of class "aafGOItem"
## @id   "GO:0005576"
## @name "extracellular region"
## @type "Cellular Component"
## @evid "IEA"
## 
## [[1]][[11]]
## An object of class "aafGOItem"
## @id   "GO:0005615"
## @name "extracellular space"
## @type "Cellular Component"
## @evid "IEA"
## 
## [[1]][[12]]
## An object of class "aafGOItem"
## @id   "GO:0005634"
## @name "nucleus"
## @type "Cellular Component"
## @evid "ISO"
## 
## [[1]][[13]]
## An object of class "aafGOItem"
## @id   "GO:0005654"
## @name "nucleoplasm"
## @type "Cellular Component"
## @evid "ISO"
## 
## [[1]][[14]]
## An object of class "aafGOItem"
## @id   "GO:0005694"
## @name "chromosome"
## @type "Cellular Component"
## @evid "IEA"
## 
## [[1]][[15]]
## An object of class "aafGOItem"
## @id   "GO:0005829"
## @name "cytosol"
## @type "Cellular Component"
## @evid "ISO"
## 
## [[1]][[16]]
## An object of class "aafGOItem"
## @id   "GO:0006351"
## @name "transcription, DNA-templated"
## @type "Biological Process"
## @evid "IEA"
## 
## [[1]][[17]]
## An object of class "aafGOItem"
## @id   "GO:0010628"
## @name "positive regulation of gene expression"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[18]]
## An object of class "aafGOItem"
## @id   "GO:0031410"
## @name "cytoplasmic vesicle"
## @type "Cellular Component"
## @evid "IEA"
## 
## [[1]][[19]]
## An object of class "aafGOItem"
## @id   "GO:0032436"
## @name "positive regulation of proteasomal ubiquitin-dependent protein catabolic process"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[20]]
## An object of class "aafGOItem"
## @id   "GO:0032436"
## @name "positive regulation of proteasomal ubiquitin-dependent protein catabolic process"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[21]]
## An object of class "aafGOItem"
## @id   "GO:0032689"
## @name "negative regulation of interferon-gamma production"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[22]]
## An object of class "aafGOItem"
## @id   "GO:0032736"
## @name "positive regulation of interleukin-13 production"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[23]]
## An object of class "aafGOItem"
## @id   "GO:0032753"
## @name "positive regulation of interleukin-4 production"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[24]]
## An object of class "aafGOItem"
## @id   "GO:0032754"
## @name "positive regulation of interleukin-5 production"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[25]]
## An object of class "aafGOItem"
## @id   "GO:0032755"
## @name "positive regulation of interleukin-6 production"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[26]]
## An object of class "aafGOItem"
## @id   "GO:0043032"
## @name "positive regulation of macrophage activation"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[27]]
## An object of class "aafGOItem"
## @id   "GO:0043032"
## @name "positive regulation of macrophage activation"
## @type "Biological Process"
## @evid "ISO"
## 
## [[1]][[28]]
## An object of class "aafGOItem"
## @id   "GO:0045944"
## @name "positive regulation of transcription from RNA polymerase II promoter"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[29]]
## An object of class "aafGOItem"
## @id   "GO:0050729"
## @name "positive regulation of inflammatory response"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[30]]
## An object of class "aafGOItem"
## @id   "GO:0051024"
## @name "positive regulation of immunoglobulin secretion"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[31]]
## An object of class "aafGOItem"
## @id   "GO:0051025"
## @name "negative regulation of immunoglobulin secretion"
## @type "Biological Process"
## @evid "IGI"
## 
## [[1]][[32]]
## An object of class "aafGOItem"
## @id   "GO:0051607"
## @name "defense response to virus"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[33]]
## An object of class "aafGOItem"
## @id   "GO:0061518"
## @name "microglial cell proliferation"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[34]]
## An object of class "aafGOItem"
## @id   "GO:0090197"
## @name "positive regulation of chemokine secretion"
## @type "Biological Process"
## @evid "IDA"
## 
## [[1]][[35]]
## An object of class "aafGOItem"
## @id   "GO:0090197"
## @name "positive regulation of chemokine secretion"
## @type "Biological Process"
## @evid "ISO"
## 
## [[1]][[36]]
## An object of class "aafGOItem"
## @id   "GO:0097191"
## @name "extrinsic apoptotic signaling pathway"
## @type "Biological Process"
## @evid "IGI"
## 
## 
## 
## $Pathway
## An object of class "aafList"
## [[1]]
## An object of class "aafPathway"
## [[1]][[1]]
## An object of class "aafGOItem"
## @id     "04623"
## @name   "Cytosolic DNA-sensing pathway"
## @enzyme ""
## An object of class "aafTable"
## Slot "probeids":
## [1] "1417808_at"
## 
## Slot "table":
## $Probe
## An object of class "aafList"
## [[1]]
## An object of class "aafProbe"
## [1] "1417808_at"
## 
## 
## $Symbol
## An object of class "aafList"
## [[1]]
## [1] "2310050C09Rik"
## attr(,"class")
## [1] "aafSymbol"
## 
## 
## $Description
## An object of class "aafList"
## [[1]]
## [1] "RIKEN cDNA 2310050C09 gene"
## attr(,"class")
## [1] "aafDescription"
## 
## 
## $Chromosome
## An object of class "aafList"
## [[1]]
## [1] "3"
## attr(,"class")
## [1] "aafChromosome"
## 
## 
## $`Chromosome Location`
## An object of class "aafList"
## [[1]]
## An object of class "aafChromLoc"
## [1] -92868358
## 
## 
## $GenBank
## An object of class "aafList"
## [[1]]
## [1] "NM_025621"
## attr(,"class")
## [1] "aafGenBank"
## 
## 
## $Gene
## An object of class "aafList"
## [[1]]
## An object of class "aafLocusLink"
## [1] 66533
## 
## 
## $UniGene
## An object of class "aafList"
## [[1]]
## [1] "Mm.144259"
## attr(,"class")
## [1] "aafUniGene"
## 
## 
## $PubMed
## An object of class "aafList"
## [[1]]
## An object of class "aafPubMed"
## [1] 10349636 11042159 11076861 11217851 12466851 16141072 16141073 21267068
## 
## 
## $`Gene Ontology`
## An object of class "aafList"
## [[1]]
## An object of class "aafGO"
## [[1]][[1]]
## An object of class "aafGOItem"
## @id   "GO:0001533"
## @name "cornified envelope"
## @type "Cellular Component"
## @evid "IBA"
## 
## [[1]][[2]]
## An object of class "aafGOItem"
## @id   "GO:0005198"
## @name "structural molecule activity"
## @type "Molecular Function"
## @evid "IBA"
## 
## [[1]][[3]]
## An object of class "aafGOItem"
## @id   "GO:0005737"
## @name "cytoplasm"
## @type "Cellular Component"
## @evid "IBA"
## 
## [[1]][[4]]
## An object of class "aafGOItem"
## @id   "GO:0018149"
## @name "peptide cross-linking"
## @type "Biological Process"
## @evid "IBA"
## 
## [[1]][[5]]
## An object of class "aafGOItem"
## @id   "GO:0030216"
## @name "keratinocyte differentiation"
## @type "Biological Process"
## @evid "IBA"
## 
## [[1]][[6]]
## An object of class "aafGOItem"
## @id   "GO:0070062"
## @name "extracellular exosome"
## @type "Cellular Component"
## @evid "ISO"
## 
## 
## 
## $Pathway
## An object of class "aafList"
## [[1]]
## An object of class "aafPathway"
## list()
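
Note on reading the output above: the `Chromosome Location` for probe 1417808_at is printed as a negative number (-92868358), while the two values for 1416200_at are positive. In the Bioconductor annotation data that annaffy draws on, the sign encodes the strand: a leading minus marks a location on the antisense strand, and the absolute value is the position in base pairs. The following is a minimal base-R sketch of that decoding, under the assumption that this convention applies here. The helper name `decode_chrloc` is hypothetical and is not part of annaffy; the numbers are copied by hand from the printed output.

```r
# Hypothetical helper (not part of annaffy): split signed chromosome
# locations into a strand symbol and an unsigned base-pair position.
decode_chrloc <- function(loc) {
  data.frame(
    strand   = ifelse(loc < 0, "-", "+"),  # minus sign = antisense strand
    position = abs(loc),                   # position in base pairs
    stringsAsFactors = FALSE
  )
}

# Values copied by hand from the printed annotation above
il33 <- decode_chrloc(c(29945789, 29925113))  # probe 1416200_at (Il33)
rik  <- decode_chrloc(-92868358)              # probe 1417808_at

print(il33)
print(rik)
```

Keeping strand and position in separate columns makes it easier to sort or filter probes by genomic position without the sign getting in the way.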

Homework

Thank you!